DILUCT: An Open-Source Spanish Dependency Parser Based on Rules, Heuristics, and Selectional Preferences
Abstract
A method for recognizing syntactic patterns in Spanish is presented. The method is based on dependency parsing: heuristic rules infer dependency relationships between words, and word co-occurrence statistics (learnt in an unsupervised manner) resolve ambiguities such as prepositional phrase attachment. If a complete parse cannot be produced, a partial structure is built with some (if not all) dependency relations identified. Evaluation shows that, in spite of its simplicity, the parser is more accurate than the existing parsers available for Spanish. Though certain grammar rules, as well as the lexical resources used, are specific to Spanish, the suggested approach is language-independent.

* This work was done with partial support from the Mexican Government (SNI, CGPI-IPN, COFAA-IPN, and PIFI-IPN). The authors cordially thank Jordi Atserias for providing the data on the comparison of the TACAT parser with our system.
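The abstract does not spell out how co-occurrence statistics settle a prepositional phrase attachment, so the following is only a minimal sketch of the general idea, not the DILUCT implementation. The triple counts, the function names, and the smoothed relative-frequency score are all assumptions made for illustration: each candidate head is scored by how often it has been seen with the given preposition and noun, and the best-scoring head wins.

```python
from collections import Counter

# Hypothetical toy counts of (head, preposition, noun) triples. They stand in
# for statistics gathered without supervision from a large corpus.
TRIPLE_COUNTS = Counter({
    ("comer", "con", "tenedor"): 12,
    ("pasta", "con", "tenedor"): 1,
    ("comer", "con", "queso"): 2,
    ("pasta", "con", "queso"): 15,
})


def head_totals(counts):
    """Sum the triple counts per head word."""
    totals = Counter()
    for (head, _prep, _noun), n in counts.items():
        totals[head] += n
    return totals


def attachment_score(head, prep, noun, counts, totals, smoothing=0.5):
    """Smoothed relative frequency of (prep, noun) given the candidate head.

    The smoothing constant is an illustrative choice; it keeps unseen
    triples from scoring exactly zero.
    """
    return (counts[(head, prep, noun)] + smoothing) / (totals[head] + smoothing)


def choose_head(candidates, prep, noun, counts=TRIPLE_COUNTS):
    """Pick the candidate head with the highest attachment score."""
    totals = head_totals(counts)
    return max(
        candidates,
        key=lambda head: attachment_score(head, prep, noun, counts, totals),
    )


# "comer pasta con tenedor": the phrase attaches to the verb.
print(choose_head(["comer", "pasta"], "con", "tenedor"))
# "comer pasta con queso": the phrase attaches to the noun.
print(choose_head(["comer", "pasta"], "con", "queso"))
```

With these toy counts, "con tenedor" attaches to the verb "comer" and "con queso" to the noun "pasta". A real system would draw the counts from a large corpus and would probably back off to word classes when a triple is unseen.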
Similar resources
Automatic Semantic Role Labeling using Selectional Preferences with Very Large Corpora
PhD thesis: Automatic Semantic Role Labeling using Selectional Preferences with Very Large Corpora (Determinación Automática de Roles Semánticos usando Preferencias de Selección sobre Corpus muy Grandes). Graduated: Hiram Calvo, Center for Research in Computing (CIC), National Polytechnic Institute (IPN), Mexico City, Mexico, 07738. Graduated on June 19th, 2006...
Domain Adaptation of a Dependency Parser with a Class-Class Selectional Preference Model
When porting parsers to a new domain, many of the errors are related to wrong attachment of out-of-vocabulary words. Since there is no available annotated data to learn the attachment preferences of the target domain words, we attack this problem using a model of selectional preferences based on domain-specific word classes. Our method uses Latent Dirichlet Allocation (LDA) to learn a domain-sp...
Spanish FreeLing Dependency Grammar
This paper presents the development of an open-source Spanish Dependency Grammar implemented in the FreeLing environment. This grammar was designed as a resource for NLP applications that require a step further in automatic natural language analysis, as is the case of Spanish-to-Basque translation. The development of wide-coverage rule-based grammars using linguistic knowledge contributes to extend...
DILUCT: Automatic Semantic Role Labeling using Selectional Preferences with Very Large Corpora (Determinación automática de roles semánticos usando preferencias de selección sobre corpus muy grandes)
We present a method for recognizing semantic roles for Spanish sentences. This method is based on dependency parsing using heuristic rules to infer dependency relationships between words, and word co-occurrence statistics (learnt in an unsupervised manner) to resolve ambiguities such as prepositional phrase attachment. If a complete parse cannot be produced, a partial structure is built with som...
Generative Modeling of Coordination by Factoring Parallelism and Selectional Preferences
We present a unified generative model of coordination that considers parallelism of conjuncts and selectional preferences. Parallelism of conjuncts, which frequently characterizes coordinate structures, is modeled as a synchronized generation process in the generative parser. Selectional preferences learned from a large web corpus provide an important clue for resolving the ambiguities of coord...
Publication date: 2006